(54) Method and computer program product for subjective image content similarity-based
retrieval



(57) A method for learning a user preference for a
desired image, the method comprising the steps of: using
either one or more examples or counterexamples of a
desired image for defining a user preference; extracting
a relative preference of a user for either one or more
image components or one or more depictive features
from the examples and/or counterexamples of desired
images; and formulating a user subjective definition of a
desired image using the relative preferences for either
image components or depictive features.

Description

BACKGROUND OF THE INVENTION 

[0001] As more and more information is available
electronically, the efficient search and retrieval of relevant
information from vast databases becomes a challenging
problem. Given an image database, selection of
images that are similar to a given query (or example) 
image is an important problem in content-based image 
database management. There are two main issues of 
concern in the design of a technique for image similar- 
ity-based retrieval: image representation, and image 
similarity. Image representation is concerned with the 
content-based representation of images. Given a con- 
tent-based image representation scheme, image simi- 
larity is concerned with the determination of 
similarity/dissimilarity of two images using a similarity 
measure based on that representation. Both image con- 
tent and image similarity are very subjective in nature. 
User preference/subjectivity in a multimedia retrieval 
system is important because for a given image, the con- 
tents of interest and the relative importance of different 
image contents are application/viewer dependent. Even 
for a single viewer or application, the interpretation of an 
image's content may vary from one query to the next. 
Therefore, a successful content similarity-based image 
retrieval system should capture the preferences/subjec- 
tivity of each viewer/application and generate 
responses that are in accordance with those
preferences/subjectivity.

[0002] Almost all existing commercial and academic 
image indexing and retrieval systems represent an 
image in terms of its low-level features such as color 
and texture properties, and image similarity is meas- 
ured in the form of: 

$$S(I, J) = \sum_{i=1}^{N} w_i \, D_{F_i}(I, J)$$

where $S(I, J)$ is a function which measures the overall
image similarity between images $I$ and $J$, each image is
represented in terms of $N$ features $F_i$, $i \in \{1, \ldots, N\}$,
$D_{F_i}(I, J)$ is a function for computing the similarity/difference
between images $I$ and $J$ based on the feature $F_i$,
and $w_i$ is the weight, or importance, of feature $F_i$ in the
overall image similarity decision [see W. Niblack, R. Barber,
W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P.
Yanker, D. Faloutsos, and G. Taubin, "The QBIC Project:
Querying Images By Content Using Color, Texture, and
Shape", SPIE Vol. 1908, 1993, pp173-187; U.S. Patent
5,579,471, R. J. Barber et al., "Image Query System
and Method", 1996; M. Stricker and M. Orengo, "Similarity
of Color Images", SPIE Vol. 2420, 1995; S. Santini
and R. Jain, "Similarity Queries in Image Databases",
CVPR, 1996; J. Smith and S. Chang, "Tools and Techniques
for Color Image Retrieval", SPIE Vol. 2670,
1996; U.S. Patent 5,652,881, M. Takahashi, K. Yanagi,
and N. Iwai, "Still Picture Search/Retrieval Method Carried
Out on the Basis of Color Information and System
For Carrying Out the Same", 1997; J.K. Wu, A.D. Narasimhalu,
B.M. Mehtre, C.P. Lam, Y.J. Gao, "CORE: a
Content-based Retrieval Engine for Multimedia Information
Systems", Multimedia Systems, Vol. 3, 1996, pp25-41;
W.Y. Ma, "NETRA: A Toolbox for Navigating Large
Image Databases", Ph.D. Dissertation, UCSB, 1997; Y.
Rui, S. Mehrotra, and M. Ortega, "A Relevance Feedback
Architecture for Content-based Multimedia Information
Retrieval Systems", IEEE Workshop on
Content-based Access of Image and Video Libraries,
1997, pp82-89].
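
As an illustration of the weighted-sum form of image similarity described in this paragraph, the following minimal Python sketch combines per-feature scores using weights. The feature names, the per-feature comparison function `l1_similarity`, and all numeric values are hypothetical and chosen only for illustration; none of the cited systems is claimed to compute similarity in exactly this way.

```python
from typing import Callable, Dict, Sequence

# Hypothetical representation: one numeric vector per named feature.
FeatureVector = Sequence[float]
Image = Dict[str, FeatureVector]


def l1_similarity(a: FeatureVector, b: FeatureVector) -> float:
    """Illustrative per-feature score D_Fi: 1 / (1 + L1 distance)."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))


def overall_similarity(
    img_i: Image,
    img_j: Image,
    weights: Dict[str, float],
    d: Callable[[FeatureVector, FeatureVector], float] = l1_similarity,
) -> float:
    """S(I, J): sum over features of w_i * D_Fi(I, J)."""
    return sum(w * d(img_i[name], img_j[name]) for name, w in weights.items())


# Two images with identical color but different texture (made-up values).
image_a = {"color": [0.2, 0.8], "texture": [0.5, 0.5]}
image_b = {"color": [0.2, 0.8], "texture": [0.0, 1.0]}

color_heavy = overall_similarity(image_a, image_b, {"color": 0.9, "texture": 0.1})
texture_heavy = overall_similarity(image_a, image_b, {"color": 0.1, "texture": 0.9})
```

With these values the same pair of images scores higher under the color-heavy weighting than under the texture-heavy one, which is the sense in which the weights encode a viewer's notion of similarity.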

[0003] In most of the systems cited above, either the
weight $w_i$ for each feature $F_i$ is fixed, or the user manually
provides the value to indicate his/her preferences
regarding the relative importance of that feature. To a 
normal user, the different features and weights gener- 
ally do not intuitively correlate to his/her interpretation of 
the query image and the desired query results. For 
example, some systems require the user to specify the 
relative importance of features such as color, texture, 
structure, and composition for processing a query. To an 
average user, what is meant by these different features, 
and what weights to assign to each one in order to 
obtain desired results is definitely unclear. The optimum 
combination of weights to use for a specific query 
toward a specific goal is highly dependent on the image 
description scheme and the similarity measure used by 
the system, and is not readily understood by the aver- 
age user. 

[0004] Recently, a few approaches have been pro- 
posed in order to overcome the above mentioned prob- 
lems. These approaches require the user to identify a 
few relevant images from the query response. The set 
of relevant images is processed to automatically deter- 
mine user preferences regarding the relative impor- 
tance of different features or the preferred distance 
measure. One such approach was proposed in [Y. Rui, 
S. Mehrotra, and M. Ortega, "A Relevance Feedback 
Architecture for Content-based Multimedia Information
Retrieval Systems", IEEE Workshop on Content-based
Access of Image and Video Libraries, 1997, pp82-89].
Given a query, multiple ranked response sets are generated
using a variety of representations and associated
similarity measures. The default response set to the
query is displayed. The user selects a few relevant
images and provides their ranking. The response set
that best matches the set of ranked relevant images is
then selected as the preference-based query response.
The major shortcomings of this approach are: (i) user
preference cannot be specified without first processing
a query; (ii) for a small set of ranked relevant images,
the final response set may not be unique; and (iii) relative
importance of individual components of an image
representation cannot be modified based on the set of
relevant images. 

[0005] Another approach to user preference-based 
query processing is to arrange all database images in a
virtual feature space. Users can "sift through" the different
subsets of feature spaces and identify the desirable
feature set for query processing [A. Gupta, S. Santini, R.
Jain, "In Search of Information in Visual Media", Communications
of the ACM, Vol. 40, No. 12, December
1997, pp35-42]. While this approach alleviates the burden
of fixed feature weights, the lack of correlation
between user interpretation of image similarity and
associated feature-based representation still remains.
Furthermore, since complete global image representations
are used for images with multiple subjects or
regions of interest, a subset of feature space that corresponds
to user preference may not exist.
[0006] A natural approach to capture a user's preferences
for image similarity computation is to automatically
extract/derive such information from the user
supplied positive examples and negative examples of
desired images. The derived preferences can then be
used to automatically determine similarity measures.
An existing system, "society of models", of Minka &
Picard [T.P. Minka and R.W. Picard, "Interactive Learning
with a Society of Models", Pattern Recognition, Vol.
30, 1997, pp565-581], adopts this approach. The system
employs a variety of feature-based image representations
and associated similarity measures to generate
several different similarity-based hierarchical clusters of
images in a database. The user supplied positive and
negative examples are used to identify the image clusters
preferred by the user. All images in the preferred
clusters form the set of images desired by the user. The
image clusters can be dynamically adapted based on
user supplied examples of desired or undesired images.
This process of modifying clusters is very time consuming
for large databases. Another system called NETRA
[W.Y. Ma, "NETRA: A Toolbox for Navigating Large
Image Databases", Ph.D. Dissertation, UCSB, 1997]
also utilizes feature similarity-based image clusters to
generate user preference-based query response. This
system has restricted feature-based representations
and clustering schemes. The main drawbacks of both
these systems are (i) the database is required to be static
(i.e., images cannot be dynamically added or deleted
from the database without complete database re-clustering);
and (ii) usually, a large number of positive and negative
examples need to be provided by the user in order
for the system to determine image clusters that correspond
to the set of desired images.

SUMMARY OF THE INVENTION 


[0007] The present invention proposes a general 
framework or system for user preference-based query 
processing. This framework overcomes the shortcom- 
ings of the existing approaches to capture and utilize 
user preferences for image retrieval.
[0008] An object of this invention is to provide a generalized
user-friendly scheme to automatically determine
user preferences for desired images and to
perform preference-based image retrieval.
[0009] A second object is to provide a system for user 
preference-based image retrieval from a dynamic data- 
base of images. That is, images can be added/deleted 
from the database dynamically without requiring com- 
plete database reorganization. 

[0010] A third object is to provide an approach to effi- 
ciently determine the relative importance of individual 
components of the image representation scheme from 
user supplied examples and counterexamples. 
[0011] These and other objects will become clear in 
the following discussion of the preferred embodiment. 
[0012] Briefly summarized, according to one aspect of 
the present invention, the invention resides in a method 
for learning a user preference for a desired image, the 
method comprising the steps of: using either one or 
more examples or counterexamples of a desired image 
for defining a user preference; extracting a relative pref- 
erence of a user for either one or more image compo- 
nents or one or more depictive features from the 
examples and/or counterexamples of desired images; 
and formulating a user subjective definition of a desired 
image using the relative preferences for either image 
components or depictive features. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0013] 

Fig. 1 is a perspective drawing of a computer work- 
station that may be used for implementing the 
present invention; 

Fig. 2 is a flowchart of an overview of the image 
registration phase of the present invention; 
Fig. 3 is a diagram illustrating an image and its 
associated sub-levels that are derived from the 
image all of which are used by the present inven- 
tion; 

Fig. 4 is a diagram illustrating a plurality of specific 
features at the feature level of the image; 
Fig. 5 is a diagram illustrating the sub-levels of the 
color feature of the image; 

Fig. 6 is a diagram illustrating the sub-levels of the 
texture level of the image; 

Fig. 7 is a flowchart illustrating a software program 
of the present invention for obtaining user prefer- 
ences; 

Fig. 8 is an alternative embodiment of Fig. 7; 
Fig. 9 is a flowchart illustrating the selection of 
images from a database using the present inven- 
tion; 

Fig. 10 is a flowchart illustrating a summary of a
portion of the present invention; and

Fig. 11 is a flowchart illustrating a summary of the
invention.

PREFERRED EMBODIMENT



[0014] In the following description, the present inven- 
tion will be described in the preferred embodiment as a 
software program. Those skilled in the art will readily 
recognize that the equivalent of such software may also 
be constructed in hardware. Consequently, the term
"computer program product" as used herein is equivalent
to an electronic system containing hardware to
produce the result of the software.
[0015] Still further, as used herein, computer readable
storage medium may comprise, for example: magnetic
storage media such as a magnetic disk (such as a 
floppy disk) or magnetic tape; optical storage media 
such as an optical disc, optical tape, or machine reada- 
ble barcode; solid state electronic storage devices such 
as random access memory (RAM), or read only mem- 
ory (ROM); or any other physical device or medium 
employed to store a computer program. 
[0016] Referring to Fig. 1, there is illustrated a computer
system 10 for implementing the present invention.
Although the computer system 10 is shown for the pur- 
pose of illustrating a preferred embodiment, the present 
invention is not limited to the computer system 10 
shown, but may be used on any electronic processing 
system. The computer system 10 includes a microproc- 
essor-based unit 20 for receiving and processing soft- 
ware programs and for performing other processing 
functions. A display 30 is electrically connected to the 
microprocessor-based unit 20 for displaying user- 
related information associated with the software. A key- 
board 40 is also connected to the microprocessor based 
unit 20 for permitting a user to input information to the 
software. As an alternative to using the keyboard 40 for 
input, a mouse 50 may be used for moving a selector 52 
on the display 30 and for selecting an item on which the 
selector 52 overlays, as is well known in the art. 
[0017] A compact disk-read only memory (CD-ROM) 
55 is connected to the microprocessor based unit 20 for 
receiving software programs and for providing a means 
of inputting the software programs and other informa- 
tion to the microprocessor based unit 20 via a compact 
disk 57, which typically includes a software program. In
addition, a floppy disk 61 may also include a software 
program, and is inserted into the microprocessor-based 
unit 20 for inputting the software program. Still further, 
the microprocessor-based unit 20 may be programmed, 
as is well known in the art, for storing the software program
internally. A printer 56 is connected to the microprocessor-based
unit 20 for printing a hardcopy of the
output of the computer system 10. 
[0018] Images may also be displayed on the display 
30 via a personal computer card (PC card) 62 or, as it 
was formerly known, a personal computer memory card 
international association card (PCMCIA card) which 
contains digitized images electronically embodied in the 
card 62. The PC card 62 is ultimately inserted into the 
microprocessor based unit 20 for permitting visual display
of the image on the display 30.
[0019] This invention presents a user preference 
based query processing system framework that adap- 
tively retrieves database images according to different 
users' notions of similarity. The overall system consists of
three functional phases: image registration phase, user 
preference understanding phase, and preference- 
based image retrieval phase. 

[0020] In the image registration phase of the system/framework,
the database is populated with images.
Referring to Fig. 2, as an image 10 is inserted (or registered)
into the database of the computer system 10, a
set of feature extraction techniques is applied S110 to
the image 10 to extract all the relevant information, or
metadata, needed for representing the image. Those
skilled in the art will recognize that a variety of methods
exist for extracting image features such as color contents,
texture contents, regions, and boundaries.
For example, referring to Fig. 4, color 20a, texture 20b,
shape 20c, composition 20d, and structure based 20e
image features are extracted to represent an image 10.
The next step is to represent the image 10 in terms of its
extracted features S120.

[0021] Referring to Fig. 3, in the preferred embodiment,
an image 10 is represented as a set of multiple
multi-level feature-based representations 20. For
instance, an image $I$ is represented as
$I = \{I_{F_1}, I_{F_2}, \ldots, I_{F_N}\}$, where $I_{F_i}$ is image $I$ represented
in terms of feature $F_i$, where $i$ is in range [1..N]. Each $I_{F_i}$
is a hierarchical representation consisting of one or
more levels. The highest level of the representation of
$I_{F_i}$ is of the form $I_{F_i} = \{C_{I_{F_i}}^{1}, C_{I_{F_i}}^{2}, \ldots\}$, where
$C_{I_{F_i}}^{j}$ (30) denotes the $j$th component of $I_{F_i}$. Each $C_{I_{F_i}}^{j}$
can in turn be described by its set of sub-components.
At the lowest level of a feature-based image representation,
each component or sub-component is represented
by a set of attribute values 40.
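
One way to picture the multi-level representation described in this paragraph is as a small tree of records. The Python sketch below is only an assumed data layout: the class names, field names, and sample values are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """A component of a feature-based representation (element 30)."""
    # Attribute sets (element 40), keyed by a hypothetical attribute name.
    attributes: Dict[str, List[float]] = field(default_factory=dict)
    # Optional sub-components, giving the hierarchy its additional levels.
    sub_components: List["Component"] = field(default_factory=list)


@dataclass
class FeatureRepresentation:
    """The image represented in terms of one feature F_i."""
    feature: str
    components: List[Component] = field(default_factory=list)


@dataclass
class ImageRepresentation:
    """One representation per feature (element 20)."""
    image_id: str
    features: Dict[str, FeatureRepresentation] = field(default_factory=dict)


def count_leaves(component: Component) -> int:
    """Count lowest-level components, i.e. those with no sub-components."""
    if not component.sub_components:
        return 1
    return sum(count_leaves(c) for c in component.sub_components)


# A color representation with one flat component and one nested component.
red_segment = Component(attributes={"color_mean": [200.0, 30.0, 30.0]})
sky = Component(sub_components=[
    Component(attributes={"color_mean": [90.0, 140.0, 230.0]}),
    Component(attributes={"color_mean": [240.0, 240.0, 250.0]}),
])
rep = ImageRepresentation(
    image_id="img-001",
    features={"color": FeatureRepresentation("color", [red_segment, sky])},
)
leaf_total = sum(count_leaves(c) for c in rep.features["color"].components)
```

Keeping attribute values only at the leaves mirrors the statement that attribute sets describe the lowest level of the hierarchy.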

[0022] In the preferred embodiment, a number of multilevel
feature-based representations are employed
(examples of two such representations are shown in
Figs. 5 and 6). Those skilled in the art will recognize
that any number of feature-based representations can
be used to generate such an image representation hierarchy.
Referring now to Figs. 5 and 6, for example, in the
preferred embodiment of image representation using
color feature 20a, a color-based representation of an
image is of the form $I_C = \{C_1, C_2, \ldots, C_K\}$, where $C_j$ 30
denotes the $j$th dominant color segment in the three-dimensional
color space spanned by image $I$. Each $C_j$ is
in turn described by one or more of its attribute sets 40.
In the preferred embodiment, one attribute set 40
describes the color space segment in terms of its color
range 40a, which can be expressed either as dimensions
of an enclosing box, or an enclosing ellipsoid.
Another attribute set describes the color space segment
in terms of its color moments 40b, such as color mean,
variance, and skewness. Yet another attribute set
describes the color segment by the color distribution
40c within the segment. Those skilled in the art will recognize
that any number of other metrics can be used to
describe the segment. Referring back to Fig. 2, the final 
step of the image registration process is to store input 
image 10 and its associated representations into the 
database S130. Those skilled in the art will recognize that
several ways for organizing images and their represen- 
tations in a database are possible. For example, flat files 
can be used to organize image representations and 
associated links to image files. Multidimensional spatial 
and point access methods can also be utilized to organ- 
ize the feature-based image representations. In the pre- 
ferred embodiment, R-tree based indices are used to 
organize the image representations. The populated 
database can then be queried to retrieve desired 
images. 

[0023] The user preference understanding phase of 
the system/framework is concerned with automatic 
extraction of user preferences for desired images via 
user interactions. There are two options to acquire user 
preferences: 

[0024] Query-based option - This option performs a 
query with default preferences to provide a response 
set. If needed, a user can then provide one or more 
examples and/or counterexamples of desired response 
images from the response set. A preferred embodiment 
of option 1 is shown in Fig. 7. An image is first selected 
by the user as the query image S310. Next, images sim- 
ilar to the query image are retrieved using default pref- 
erences S320. A representation-based similarity 
measure is used to identify and retrieve images that are 
similar to the query image. Therefore the similarity 
measure used by a system is dependent on the underlying
image representation scheme. In general, the similarity
measure is of the form

$$S(A, B) = \mathrm{Sim}(R_A, R_B)$$

[0025] where $S(A, B)$ denotes the similarity of images
$A$ and $B$ having representations $R_A$ and $R_B$, respectively,
and $\mathrm{Sim}(\,)$ denotes the representation-based similarity
measure. For a multi-feature-based image
representation scheme, where an image $I$ is represented
as $R_I = \{R_I^{F_1}, R_I^{F_2}, \ldots, R_I^{F_N}\}$, $\mathrm{Sim}(R_A, R_B)$
can be of the form

$$\mathrm{Sim}(R_A, R_B) = \sum_{i=1}^{N} w_i \, D_{F_i}(R_A, R_B)$$

where $D_{F_i}(R_A, R_B)$ denotes the similarity of image representations
$R_A^{F_i}$ and $R_B^{F_i}$ with respect to feature $F_i$,
and $w_i$ denotes the relative importance of feature $F_i$.
Those skilled in the art will recognize that this similarity
scheme can be generalized to a multi-level hierarchical
image representation scheme. Furthermore, other
methods of aggregating the $D_{F_i}$'s to obtain $\mathrm{Sim}(\,)$ are also
possible. At the lowest level of the hierarchical representation,
similarity is determined by the similarity of the
attribute values. For example, the attribute sets used in
our embodiment of color-based image representation
can be compared using the following measures. The
similarity of two color space segments represented by
the segment's bounding range dimension attribute set
can be computed as the degree of overlap of the two
bounding regions. The similarity of two color space segments
represented as color moments can be computed
as the Euclidean, absolute value distance, or Mahalanobis
distance. The similarity of two color space segments
represented by color distributions can be
determined using one of the following distance metrics:

$$\sum_{i=0} \frac{\mathrm{Min}(x_i, y_i)}{\mathrm{Max}(x_i, y_i)}$$

$$H_D(x, y) = \sum_{i=0} \mathrm{Min}(x_i, y_i)$$

where $x$ and $y$ are the two respective distributions of the
two color space segments under comparison, and $x_i$ and $y_i$
are the normalized frequency values. Those skilled in
the art would recognize that similarity values obtained
for different attribute level representations can be combined
in various ways to obtain an overall similarity at
the next level of the hierarchical representation.
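
The attribute-level comparisons named in this paragraph (overlap of bounding regions, a distance between color moments, and a comparison of color distributions) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the text does not define "degree of overlap", so intersection volume divided by union volume is assumed here; the function names and sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[Sequence[float], Sequence[float]]  # (min corner, max corner)


def box_overlap(a: Box, b: Box) -> float:
    """Assumed overlap degree: intersection volume / union volume."""
    inter = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a[0], a[1], b[0], b[1]):
        side = min(hi_a, hi_b) - max(lo_a, lo_b)
        if side <= 0:
            return 0.0  # disjoint along this axis
        inter *= side
    vol_a = math.prod(hi - lo for lo, hi in zip(a[0], a[1]))
    vol_b = math.prod(hi - lo for lo, hi in zip(b[0], b[1]))
    return inter / (vol_a + vol_b - inter)


def moment_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two color-moment vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def histogram_intersection(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of element-wise minima of two normalized distributions."""
    return sum(min(xi, yi) for xi, yi in zip(x, y))


# Made-up sample values for two color space segments.
overlap = box_overlap(((0, 0, 0), (2, 2, 2)), ((1, 1, 1), (3, 3, 3)))
distance = moment_distance([0.0, 0.0], [3.0, 4.0])
intersection = histogram_intersection([0.5, 0.3, 0.2], [0.4, 0.4, 0.2])
```

Note that the first and third functions return larger values for more similar segments, whereas the moment distance returns smaller values; any scheme that combines them at the next level has to account for that difference in direction.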
[0026] Those skilled in the art would also recognize
that the image retrieval process is also dependent on
the way the image representations are organized in the
database. For example, if image representations are
stored in flat files, then most likely every image in the
database needs to be compared with the query image to
identify similar images. A more efficient way of selecting
candidate similar images can be achieved by organizing
image representations in efficient indices. In our
embodiment, multidimensional spatial and point access
methods are used to organize image representations.
For example, our color-based image representations
can be organized using index structures based on
bounding regions, color moments, and/or color distribution.
The approach adopted in the preferred embodiment is
to organize color-based image representations using
only one of the representation components, and using
the complete representation components for computing
image similarities. For example, the bounding region
component of the color-based representation of color
space segments is used to organize all the color space
segments belonging to database images in an R-tree
like index structure. For this organization scheme, given
a query image, the index is searched to identify color
space segments with bounding regions overlapping the
bounding regions of the query image color space segments.
Images associated with these overlapping segments
are considered to be candidate images. Color
distribution values and color moment values are used to
compute the overall similarity between the candidate
images and the query image. The computed similarity is
used to rank the candidate image set. The ranked
images are then displayed to the user S330. The user can
elect to save the candidate image set S332 for later use.
Three options exist for saving the candidate image set
S335:

1. Cluster-based - This option treats the candidate
set as a static image collection, and generates hierarchical
clusters using various image representation
component-based similarities, as reported in
[T.P. Minka and R.W. Picard, "Interactive Learning
with a Society of Models", Pattern Recognition, Vol.
30, 1997, pp565-581]. These clusters are then
stored in a storage medium.

2. Rank-set-based - This option uses a set of different
representation-based similarity measures to
obtain the corresponding ranking of the images in
the candidate image sets, as reported in [Y. Rui, S.
Mehrotra, and M. Ortega, "A Relevance Feedback
Architecture for Content-based Multimedia Information
Retrieval Systems", IEEE Workshop on Content-based
Access of Image and Video Libraries,
1997, pp82-89]. These ranked candidate sets are
then stored in a storage medium.

3. Matching-component-based - This option simply
saves the candidate image set in a storage
medium. In addition, for each candidate image representation
component, the matching query component
and the associated match quality are also
stored.



[0027] One or more of these options can be selected 
at the time of system initialization. 
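
The two-stage retrieval described in paragraph [0026] (candidate selection by bounding-region overlap, followed by ranking with a fuller comparison) can be sketched as below. This is a simplified illustration, not the embodiment itself: a linear scan stands in for the R-tree-like index, only the color distribution is used for ranking, and the segment keys `"box"` and `"hist"`, the image names, and all values are hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple

Box = Tuple[Sequence[float], Sequence[float]]  # (min corner, max corner)
Segment = Dict[str, Any]  # hypothetical keys: "box" and "hist"


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when two axis-aligned boxes intersect along every axis."""
    return all(lo_a < hi_b and lo_b < hi_a
               for lo_a, hi_a, lo_b, hi_b in zip(a[0], a[1], b[0], b[1]))


def hist_intersection(x: Sequence[float], y: Sequence[float]) -> float:
    return sum(min(xi, yi) for xi, yi in zip(x, y))


def retrieve(query: List[Segment],
             database: Dict[str, List[Segment]]) -> List[Tuple[str, float]]:
    # Stage 1: candidate images own at least one segment whose bounding
    # region overlaps a query segment's bounding region.
    candidates = [
        image_id for image_id, segments in database.items()
        if any(boxes_overlap(q["box"], s["box"]) for q in query for s in segments)
    ]
    # Stage 2: score each candidate by the best-matching overlapping
    # segment for every query segment, then rank by total score.
    scored = []
    for image_id in candidates:
        score = sum(
            max((hist_intersection(q["hist"], s["hist"])
                 for s in database[image_id]
                 if boxes_overlap(q["box"], s["box"])), default=0.0)
            for q in query
        )
        scored.append((image_id, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


database = {
    "sunset": [{"box": ((200, 50, 0), (255, 120, 60)), "hist": [0.7, 0.2, 0.1]}],
    "beach": [{"box": ((180, 100, 40), (240, 200, 150)), "hist": [0.3, 0.4, 0.3]}],
    "forest": [{"box": ((0, 80, 0), (60, 200, 60)), "hist": [0.1, 0.1, 0.8]}],
}
query = [{"box": ((190, 60, 10), (250, 130, 70)), "hist": [0.6, 0.3, 0.1]}]
ranked = retrieve(query, database)
```

In this example the "forest" image is never scored because none of its segments overlaps the query region, which is the saving the index-based organization is meant to provide.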

[0028] If the displayed candidate set meets the user's
preference for desired images S337, then the process is
terminated S338. Otherwise, user interactions are
required to obtain examples (shown as + in Fig. 7)
and/or counterexamples (shown as - in Fig. 7) of
desired images S340. User provided examples and
counterexamples are processed in conjunction with the
saved candidate set to automatically infer user preferences
for the desired response S350. The user preference
determination process is dependent on the option
selected for saving the initial candidate image set. For
the cluster-based option of saving the candidate set, examples
and counterexamples are used to identify representation
components, and the associated segments of
cluster hierarchy that contain example components but
no counterexample components. A similar approach is
adopted in [T.P. Minka and R.W. Picard, "Interactive
Learning with a Society of Models", Pattern Recognition,
Vol. 30, 1997, pp565-581]. This information is used
to derive the representations of desired images, that is,
the images desired by the user are required to have the
derived representations. For the rank-set-based option of
saving the candidate image set, the user is required to
input the desired rankings of some example images.
The user supplied rankings are used to identify the
stored ranked candidate set in which the rankings of the
example images most closely match the user provided
rankings. This information is used to identify the preferred
representation and associated similarity measure
from a fixed set of representations and similarity measures
(those that were used in generating the ranked
candidate sets in step S335). A similar approach is
adopted in [Y. Rui, S. Mehrotra, and M. Ortega, "A Relevance
Feedback Architecture for Content-based Multimedia
Information Retrieval Systems", IEEE Workshop
on Content-based Access of Image and Video Libraries,
1997, pp82-89]. For the matching-component-based option
of saving the candidate image set, the examples and
counterexamples are used to derive the relative importance
of various representation components at all levels
of the representation hierarchy. At any given level of
representation hierarchy, the relative importance of a query
image representation component is determined by the
number of matching example and counterexample
image components, and their matching quality. Every
occurrence of a matching component in an example
image increases the relative importance of that query
image representation component in proportion to the
quality of match. Similarly, every occurrence of a matching
component in a counterexample image decreases
the relative importance of that query image representation
component in proportion to the quality of match.
Those skilled in the art would recognize that other
options for saving candidate sets and deriving user preferences
can be employed within the scope of this invention.
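
The matching-component-based derivation described in paragraph [0028] can be sketched as a signed accumulation of match quality. The sketch below is illustrative only: the specification does not say how a "match" is decided or how quality is measured, so histogram intersection and a fixed `threshold` are assumed, and the component names and values are hypothetical.

```python
from typing import Dict, List, Sequence

Component = Sequence[float]  # hypothetical: a component as a normalized histogram


def match_quality(a: Component, b: Component) -> float:
    """Illustrative match quality in [0, 1]: histogram intersection."""
    return sum(min(x, y) for x, y in zip(a, b))


def derive_importance(
    query: Dict[str, Component],
    examples: List[List[Component]],
    counterexamples: List[List[Component]],
    threshold: float = 0.5,  # assumed cut-off for calling two components a match
) -> Dict[str, float]:
    """Accumulate relative importance for each query component."""
    importance = {name: 0.0 for name in query}
    for sign, images in ((+1.0, examples), (-1.0, counterexamples)):
        for image in images:
            for name, q_comp in query.items():
                for comp in image:
                    quality = match_quality(q_comp, comp)
                    if quality >= threshold:
                        # Examples raise importance, counterexamples lower it,
                        # each in proportion to the quality of the match.
                        importance[name] += sign * quality
    return importance


query = {"red": [1.0, 0.0, 0.0], "blue": [0.0, 0.0, 1.0]}
examples = [[[0.9, 0.1, 0.0]], [[0.8, 0.0, 0.2]]]
counterexamples = [[[0.0, 0.1, 0.9]]]
importance = derive_importance(query, examples, counterexamples)
```

Here the "red" component gains importance because it recurs in both examples, while "blue" loses importance because it matches only the counterexample.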

[0029] The derived user preferences can be stored 
permanently or temporarily. If the user elects to store 
the derived preferences in his/her user profile S360, a 
preference record containing the query image representation
and the derived preferences is added to a user
selected preference set S370. Any duplicate preference
record detection method can be used to avoid duplicate
preference records in a preference set. If the user does
not elect to add the derived preferences to his/her user 
profile S360, the preference record which contains the 
current query image representation is retained for use in 
the current session S380. 
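
The text leaves the duplicate preference record detection method open. As one possible sketch, a record can be compared by a canonical serialization before it is added to a preference set. The record layout, the field names, and the use of JSON as the comparison key are all assumptions made for illustration.

```python
import json
from typing import Any, Dict, List

PreferenceRecord = Dict[str, Any]  # hypothetical record layout


def record_key(record: PreferenceRecord) -> str:
    """Canonical JSON text used as an assumed duplicate-detection key."""
    return json.dumps(record, sort_keys=True)


def add_preference(preference_set: List[PreferenceRecord],
                   record: PreferenceRecord) -> bool:
    """Append the record unless an identical one is already stored."""
    key = record_key(record)
    if any(record_key(existing) == key for existing in preference_set):
        return False
    preference_set.append(record)
    return True


profile: List[PreferenceRecord] = []
record = {
    "query_representation": {"red": [1.0, 0.0, 0.0]},
    "preferences": {"red": 1.7},
}
first_added = add_preference(profile, record)
second_added = add_preference(profile, dict(record))  # identical copy
```

Exact comparison is the simplest choice; a system that should also treat nearly identical records as duplicates would need a tolerance-based comparison instead.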

[0030] Non query-based option - This option of deter- 
mining user preferences does not involve any initial 
query processing. The main steps of the embodiment 
are shown in Fig. 8. The process starts by user interac- 
tion to obtain examples (shown as '+' in Fig. 8) and 
counterexamples (shown as '-' in Fig. 8) of desired 
images S410. The user can specify actual image IDs for the
example and counterexample images, or use a random
browser to select such images. These examples and
counterexamples are then used to extract user prefer- 
ences S420. The preferred embodiment offers two 
options for determining user preferences: 



1. Cluster-based - This option is in principle the 
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same as the cluster-based option of saving the can- 
didate set S335 (Fig. 7) in the query-based user 
preference understanding process as described 
earlier For this option to be applicable, the image 
database must be static, and the hierarchical clus- 5 
ters must have been generated and stored at the 
end of the image registration process S130 (Fig. 2). 
User supplied examples and counterexamples are 
used to identify representation components, and 
the associated segments of cluster hierarchy that 10 
contain example components but no counterexam- 
ple components. A similar approach is adopted in 
[T.P Minka and R.W. Picard, "Interactive Learning 
with a Society of Models", Pattern Recognition, Vol. 
30, 1997, pp565-581]. This information is used to 75 
derive the representations of desired images, that 
is, the images desired by the user are required to 
have the derived representations. 
2. Matching component-based - This option is
again in principle the same as the matching
component-based option of saving the candidate set S335
(Fig. 7) in the query-based user preference
understanding process as described earlier. A pseudo-query
image representation is generated by the
union of all the image representations of the user
supplied example images. The user supplied examples
and counterexamples are used to derive the
relative importance of various representation components
at all levels of the representation hierarchy
of the pseudo-query representation. At any given
level of representation hierarchy, the relative importance
of a pseudo-query image representation
component is determined by the number of matching
example and counterexample image components,
and their matching quality. Every occurrence
of a matching component in an example image
increases the relative importance of that pseudo-query
image representation component in proportion
to the quality of match. Similarly, every occurrence
of a matching component in a
counterexample image decreases the relative
importance of that pseudo-query image representation
component in proportion to the quality of
match.
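
The matching component-based weighting described in option 2 can be illustrated with a minimal sketch. It is not the implementation of the present system: the component encoding as (label, value) pairs, the function names, the match_quality measure, and the 0.5 matching threshold are all assumptions made for illustration, and only a single level of the representation hierarchy is shown.

```python
def build_pseudo_query(examples):
    """Union of the components of all example images (order preserved)."""
    pseudo_query = []
    for image in examples:
        for component in image:
            if component not in pseudo_query:
                pseudo_query.append(component)
    return pseudo_query


def match_quality(a, b):
    """Hypothetical match quality in [0, 1] for (label, value) components."""
    if a[0] != b[0]:
        return 0.0
    return max(0.0, 1.0 - abs(a[1] - b[1]))


def relative_importance(pseudo_query, examples, counterexamples,
                        quality_fn=match_quality, threshold=0.5):
    """Raise a component's weight for each match in an example image and
    lower it for each match in a counterexample image, in proportion to
    the quality of the match."""
    weights = {component: 0.0 for component in pseudo_query}
    for sign, images in ((1.0, examples), (-1.0, counterexamples)):
        for image in images:
            for component in image:
                for query_component in pseudo_query:
                    quality = quality_fn(query_component, component)
                    if quality >= threshold:  # assumed matching criterion
                        weights[query_component] += sign * quality
    return weights


# Illustrative data: each image is a list of (label, value) components.
examples = [[("sky", 0.8), ("grass", 0.4)], [("sky", 0.8)]]
counterexamples = [[("grass", 0.4)]]
pseudo_query = build_pseudo_query(examples)
weights = relative_importance(pseudo_query, examples, counterexamples)
print(weights)
```

In this illustration the "sky" component, which occurs only in examples, ends with a higher weight than the "grass" component, whose occurrence in a counterexample offsets its occurrence in an example.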


[0031] The derived user preferences can then be
stored permanently or temporarily. If the user elects to
store the derived preferences in his/her user profile
S430, a preference record containing the pseudo-query
image representation and the derived preferences is
added to a user selected preference set S440. Any
duplicate preference record detection method can be
used to avoid duplicate preference records in a preference
set. If the user does not elect to add the derived
preferences to his/her user profile S430, the preference
record is retained for use in the current session S450.
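
The storage step S430 to S450 can be sketched as follows. Since any duplicate preference record detection method can be used, the hash-based check shown here is only one assumed choice; the record layout and the names record_key and store_preference are hypothetical.

```python
import hashlib
import json


def record_key(record):
    """Canonical fingerprint of a preference record (assumed to be
    JSON-serializable); identical records yield identical keys."""
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def store_preference(record, save_to_profile, preference_set, session_records):
    """Add the record to the user selected preference set (S440) unless it
    is a duplicate, or retain it for the current session only (S450)."""
    if not save_to_profile:
        session_records.append(record)
        return "session"
    key = record_key(record)
    if key in preference_set:
        return "duplicate"
    preference_set[key] = record
    return "stored"


# Illustrative record: pseudo-query components plus derived weights.
record = {"pseudo_query": ["sky", "grass"],
          "weights": {"sky": 2.0, "grass": 0.0}}
preference_set, session_records = {}, []
print(store_preference(record, True, preference_set, session_records))
print(store_preference(record, True, preference_set, session_records))
```

An exact fingerprint treats two records as duplicates only when they are identical; a tolerance-based comparison of the weights would be an equally valid choice.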
[0032] The above process of determining user preferences
can be repeated as many times as desired to
update or modify the preference records.
[0033] The user preference based retrieval phase of 
the present system/framework allows the user to query 
for desired images using user specified preferences. 
Two embodiments for this functional phase are available 
in our system: 

1. Query image and preference combination - The 
embodiment of this type of retrieval is shown in Fig. 
9. For this type of retrieval, the user specifies a query
image S510, and a preference file S520. An auto- 
matic preference applicability check is performed 
S530 to determine whether this user-selected pref- 
erence is compatible with the user-selected query. 
The method used to perform the applicability check 
is again dependent on the preference understand- 
ing option used. For the cluster-based preferences, 
the query image must belong to one of the pre- 
ferred clusters. For the ranked set preferences, the 
components in the user selected query and the 
components in the saved preference file are compared.
At least a threshold number of the components in the
user-selected query must match components in the
preference file. The threshold
number of matching components is adjustable and 
can be set at initial system installation time. The 
same applicability test is applied to the matching 
component-based preferences. If the preference 
file chosen by the user is not applicable to this par- 
ticular query, the user must select another prefer- 
ence file, or return to the user preference 
understanding phase to specify a preference. 
When an applicable preference is available, the 
next step in the process is to combine the informa- 
tion in the preference files with the user-selected 
query S540. In the case of cluster-based preferences,
this step is not needed. For the other two types of
preferences, the components in the user-selected
query are used for the actual querying; the preference
file simply provides the weight information
associated with these components. For matching
component-based preferences, default weights are
associated with the components in the user-supplied
query that are not in the query/pseudo-query
stored in the preference file. After an applicable
preference file is selected and combined with the
query representation, the user selects an image collection
S550 which is used for image retrieval. The
image collection can be one of the previously 
stored candidate image sets, or the entire image 
database. A response set is generated S560 from 
the selected image collection set using the prefer- 
ence-query combination. The process of generat- 
ing the response set is dependent on the 
preference understanding option. For the cluster- 
based preference understanding, the images 
belonging to the clusters identified in the preference 
file form the response set. Optionally, a ranking can
be generated for the images in the response set
using any representation-based similarity measure. 
For the ranked set-based preference understand- 
ing, the preferred representation scheme is used to 
identify the desired images from the selected image 
collection. The identified images are then ranked 
using the preferred similarity measure. For the 
matching component-based preferences, the 
selected query image representation is used to 
identify desired images from the selected image 
collection. The preferred weights for the represen- 
tation components are then used to determine the 
overall similarity of desired images to the query 
image and their rankings. The desired images 
along with their rankings form the response set to 
the query. This response set is then displayed to the 
user S570. 

2. Preference only retrieval - The embodiment of
this type of retrieval is shown in Fig. 10. The user
selects a preference set S610, and also an image
collection S620. The image collection can be one of 
the previously stored candidate image sets, or the 
entire image database. A response set is generated 
S630 from the selected image collection set using 
the selected preferences. The process of generat- 
ing the response set is dependent on the prefer- 
ence understanding option. For this type of 
retrieval, only the cluster-based or matching com- 
ponent-based preferences can be used. For the 
cluster-based preference understanding, the 
images belonging to the clusters identified in the 
preference file form the response set. Optionally, 
ranking can be generated for the images in the 
response set using any representation-based simi- 
larity measure. For the matching component-based 
preferences, the query or pseudo-query image rep- 
resentation is used to perform image similarity 
based retrieval from the selected image collection. 
The selected images are then ranked using a repre- 
sentation based similarity measure using the rela- 
tive importance values in the preference file. This 
response set is then displayed to the user S640. 
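
For matching component-based preferences, the applicability check S530, the combination step S540 and the generation of a ranked response set S560 can be sketched as below. This is a simplified illustration under stated assumptions: components are compared by exact equality rather than by a similarity measure, every image in the collection is scored, and the names is_applicable, combine and rank_images, as well as the default weight of 1.0, are hypothetical.

```python
def is_applicable(query_components, preference_weights, min_matches=1):
    """Applicability check: enough query components must also appear in
    the preference file. min_matches stands in for the adjustable
    threshold set at system installation time."""
    matches = sum(1 for c in query_components if c in preference_weights)
    return matches >= min_matches


def combine(query_components, preference_weights, default_weight=1.0):
    """Query components drive the search; the preference file supplies
    their weights, with a default for components it does not contain."""
    return {c: preference_weights.get(c, default_weight)
            for c in query_components}


def rank_images(collection, weighted_query):
    """Score each image by the summed weights of the query components it
    contains and return image ids from best to worst."""
    scores = {
        image_id: sum(w for c, w in weighted_query.items() if c in components)
        for image_id, components in collection.items()
    }
    return sorted(scores, key=lambda image_id: scores[image_id], reverse=True)


# Illustrative data only.
preferences = {"sky": 2.0, "grass": -1.0}
query = ["sky", "grass", "tree"]
collection = {"a": {"sky", "tree"}, "b": {"grass"}, "c": {"sky", "grass"}}

if is_applicable(query, preferences, min_matches=2):
    print(rank_images(collection, combine(query, preferences)))
```

For preference only retrieval, the same ranking can be applied with the stored query or pseudo-query components taking the place of the user-selected query, in which case no default weights are needed.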

[0034] The process of understanding user preference 
and applying that for preference-based query process- 
ing can be repeated until the desired response to the 
query is obtained. 

[0035] In summary, this invention relates generally to
the field of digital image processing and digital image
understanding, and more particularly to subjective image
content similarity-based retrieval. With reference to Fig.
11, the user supplied example(s) and/or counterexample(s)
of desired digital images 10 are digitally processed
20 to determine the user's relative preference for
different image components and/or depictive features.
The result 30 obtained from processing step 20, which
is the user's subjective definition of a desired image, is
used in an operation step 40 wherein the retrieval of the
desired images from a database is performed based
upon the result 30.

[0036] Other aspects of the invention include: 

1. The method wherein step (c) of formulating the
definition of a desired image includes a list of either
desired or undesired depictive features of either an
image component or an image and their relative
preferences to the user.
2. The method comprising the step of storing the
user subjective definition of image similarity.
3. A method for retrieving user desired images com- 
prising the steps of 

(a) formulating a user subjective definition of a
desired image using the relative preferences
for either image components or depictive features
from either example or counterexample of
a desired image comprising;

(a1) extracting a relative preference of a
user for either one or more image components
or one or more depictive features
from the examples and/or counterexamples
of desired images; and

(b) applying the user subjective definition of a 
desired image to identify and retrieve the user 
desired images. 

4. The method wherein step (a) further comprises 
identifying similar and dissimilar image components 
among examples and/or counterexamples of a 
desired image. 

5. The method wherein an image component can
be either an entire image or a segment of the
image.

6. The method wherein step (a) further comprises
identifying similar and dissimilar depictive features
of either image components or image among the
examples and counterexamples of a desired image.
7. The method wherein depictive features include 
color, color composition, texture, structure, and 
shape. 

8. The method wherein step (a) further comprises
computing the relative preference for an image
component based on a frequency of occurrence of
the image component in the examples and counter
examples of a desired image.

9. The method wherein step (a) further comprises
computing the relative preference for an image
component based on a similarity of the image components
among examples and counter examples of
a desired image.

10. The method wherein step (a) further comprises
computing the relative preference for a depictive
feature based on a frequency of occurrence of the
feature in the examples and counter examples of a
desired image.

11. The method wherein step (a) further comprises
computing the relative preference for a depictive
feature based on a similarity of the depictive feature
among examples and counter examples of a
desired image.

12. The method wherein step (a) of formulating the 
definition of a desired image includes a list of either 
desired or undesired image components and their 
relative preferences to the user.

13. The method wherein step (a) of formulating the 
definition of a desired image includes a list of either 
desired or undesired depictive features of either an 
image component or an image and their relative 
preferences to the user.

14. The method wherein step (b) further comprises 
selecting images containing either one or more 
desired image components or one or more desired 
depictive features from a database. 

15. The method wherein step (b) comprises applying
the user subjective definition of a desired image
to a user specified query image to retrieve desired
images.

16. The method wherein step (b) comprises applying
a subset of the user subjective definition of a
desired image consisting of the relative preferences
for either the image components or depictive features
present in the query image.

17. The method wherein step (b) further comprises
the step of ranking retrieved images based on the
relative preference of either the image components
or the depictive features.

18. The method further comprising the step of dis- 
playing the retrieved images. 

19. A computer program product for learning a user
preference for a desired image, comprising a computer
readable storage medium having a computer
program stored thereon for performing the steps of:

(a) using either one or more examples or counterexamples
of a desired image for defining a
user preference;

(b) extracting a relative preference of a user for
either one or more image components or one
or more depictive features from the examples
and/or counterexamples of desired images;
and

(c) formulating a user subjective definition of a
desired image using the relative preferences
for either image components or depictive features.

20. The computer program product wherein step (b)
further comprises identifying similar and dissimilar
image components among examples and/or counterexamples
of a desired image.

21. The computer program product wherein an
image component can be either an entire image or
a segment of the image.

22. The computer program product wherein step (b) 
further comprises identifying similar and dissimilar 
depictive features of either the image components 
or the image among the examples and counterex- 
amples of a desired image. 

23. The computer program product wherein depic- 
tive features include color, color composition, tex- 
ture, structure, and shape. 

24. The computer program product wherein step (b) 
further comprises computing the relative prefer- 
ence for an image component based on a fre- 
quency of occurrence of the image component in 
the examples and counter examples of a desired 
image. 

25. The computer program product wherein step (b) 
further comprises computing the relative prefer- 
ence for an image component based on a similarity 
of the image components among examples and 
counter examples of a desired image. 

26. The computer program product wherein step (b) 
further comprises computing the relative prefer- 
ence for a depictive feature based on a frequency of 
occurrence of the feature in the examples and 
counter examples of a desired image. 

27. The computer program product wherein step (b) 
further comprises computing the relative prefer- 
ence for a depictive feature based on a similarity of 
the depictive feature among examples and counter 
examples of a desired image. 

28. The computer program product wherein step (c) 
of formulating the definition of a desired image 
includes a list of either desired or undesired image 
components and their relative preferences to the 
user. 

29. The computer program product wherein step (c) 
of formulating the definition of a desired image 
includes a list of either desired or undesired depic- 
tive features of either an image component or an 
image and their relative preferences to the user. 

30. The computer program product further compris- 
ing the step of storing the user subjective definition 
of image similarity.

31. A computer program product for retrieving user
desired images comprising the steps of 

(a) formulating a user subjective definition of a 
desired image using the relative preferences 
for either image components or depictive fea- 
tures from either example or counterexample of 
a desired image comprising; 

(a1) extracting a relative preference of a
user for either one or more image components
or one or more depictive features
from the examples and/or counterexamples
of desired images; and

(b) applying the user subjective definition
of a desired image to identify and retrieve
the user desired images.

32. The computer program product wherein step (a) 
further comprises identifying similar and dissimilar 
image components among examples and/or coun- 
terexamples of a desired image. 

33. The computer program product wherein an 
image component can be either an entire image or 
a segment of the image. 

34. The computer program product wherein step (a) 
further comprises identifying similar and dissimilar 
depictive features of either image components or 
image among the examples and counterexamples 
of a desired image. 

35. The computer program product wherein depic- 
tive features include color, color composition, tex- 
ture, structure, and shape. 

36. The computer program product wherein step (a) 
further comprises computing the relative prefer- 
ence for an image component based on a fre- 
quency of occurrence of the image component in 
the examples and counter examples of a desired
image. 

37. The computer program product wherein step (a) 
further comprises computing the relative prefer- 
ence for an image component based on a similarity 
of the image components among examples and 
counter examples of a desired image. 

38. The computer program product wherein step (a) 
further comprises computing the relative prefer- 
ence for a depictive feature based on a frequency of 
occurrence of the feature in the examples and 
counter examples of a desired image. 

39. The computer program product wherein step (a) 
further comprises computing the relative prefer- 
ence for a depictive feature based on a similarity of 
the depictive feature among examples and counter 
examples of a desired image. 

40. The computer program product wherein step (a) 
of formulating the definition of a desired image 
includes a list of either desired or undesired image 
components and their relative preferences to the
user.

41. The computer program product wherein step (a)
of formulating the definition of a desired image 
includes a list of either desired or undesired depic- 
tive features of either an image component or an 
image and their relative preferences to the user. 

42. The computer program product wherein step (b) 
further comprises selecting images containing 
either one or more desired image components or 
one or more desired depictive features from a data- 
base. 

43. The computer program product wherein step (b) 
comprises applying the user subjective definition of 
a desired image to a user specified query image to 
retrieve desired images. 
44. The computer program product wherein step (b)
comprises applying a subset of the user subjective 
definition of a desired image consisting of the rela- 
tive preferences for either the image components or 
depictive features present in the query image. 

45. The computer program product wherein step (b) 
further comprises the step of ranking retrieved
images based on the relative preference of either 
the image components or the depictive features. 

46. The computer program product further compris- 
ing the step of displaying the retrieved images. 



Claims 

1. A method for learning a user preference for a
desired image, the method comprising the steps of:

(a) using either one or more examples or counterexamples
of a desired image for defining a
user preference;

(b) extracting a relative preference of a user for
either one or more image components or one
or more depictive features from the examples
and/or counterexamples of desired images;
and

(c) formulating a user subjective definition of a
desired image using the relative preferences
for either image components or depictive features.

2. The method as in claim 1, wherein step (b) further 
comprises identifying similar and dissimilar image 
components among examples and/or counterex- 
amples of a desired image. 

3. The method as in claim 2, wherein an image com- 
ponent can be either an entire image or a segment 
of the image. 

4. The method as in claim 1, wherein step (b) further 
comprises identifying similar and dissimilar depic- 
tive features of either the image components or the 
image among the examples and counterexamples 
of a desired image. 

5. The method as in claim 4, wherein depictive fea- 
tures include color, color composition, texture, 
structure, and shape. 

6. The method as in claim 1, wherein step (b) further
comprises computing the relative preference for an
image component based on a frequency of occurrence
of the image component in the examples and
counter examples of a desired image.

7. The method as in claim 1, wherein step (b) further
comprises computing the relative preference for an
image component based on a similarity of the
image components among examples and counter
examples of a desired image.

8. The method as in claim 1, wherein step (b) further
comprises computing the relative preference for a
depictive feature based on a frequency of occurrence
of the feature in the examples and counter
examples of a desired image.

9. The method as in claim 1, wherein step (b) further
comprises computing the relative preference for a
depictive feature based on a similarity of the depictive
feature among examples and counter examples
of a desired image.



10. The method as in claim 1, wherein step (c) of for- 
mulating the definition of a desired image includes 
a list of either desired or undesired image compo- 
nents and their relative preferences to the user. 